20 research outputs found

    An Improved Certificateless Generalized Signcryption Scheme

    Signcryption is a cryptographic primitive that provides signature and encryption functions simultaneously, but it is not useful when only one of the functions is required. Generalized Signcryption (GSC) is a cryptographic primitive that provides signcryption when confidentiality and authenticity are needed simultaneously, and provides encryption or signature alone when only one of them is needed. The first Generalized Signcryption scheme was proposed in 2006 by Han et al. Since then, many Generalized Signcryption schemes have been proposed: some based on the ECDLP, some on bilinear pairings, some identity-based, and some in the certificateless setting. The majority of Generalized Signcryption schemes use the Random Oracle Model for their security proofs, and a few are based on the Standard Model. In this thesis we survey the existing GSC schemes and compare their security properties and efficiency. We also propose two schemes: the first is an Identity-based Generalized Signcryption Scheme, and the second is a Certificateless Generalized Signcryption Scheme that is a variation of the Certificateless Signcryption Scheme by Barbosa et al. We begin by giving a formal definition of the GSC primitive and conclude with a comparative study with other models. Finally, we look ahead at what future progress might be made in the field.

    Detection of Deceptive Online Reviews using Machine Learning Techniques

    Today, individuals and decision makers in organizations are strongly influenced by online opinion forums and review websites when accepting or rejecting a particular item or product. Review websites have become one of the key platforms where consumers compare products and services and share their views and experiences. These customer reviews are increasingly used by individuals, manufacturers and retailers before making business decisions. Because the reviews received are not scrutinized, they sometimes include review spam. Moreover, driven by the desire for advantage, publicity or both, spammers produce synthesized reviews to promote or demote certain products or brands. Opinion spamming can hurt consumers and damage businesses, and it has the potential to create social and political chaos. Hence, fake and deceptive reviews need to be detected so that social media and review sites remain a trusted source of public opinion. As opinions in social media and forums are increasingly used in practice, opinion spamming is becoming more and more sophisticated, which presents a major challenge for its detection. Even though opinion spam and fake review detection have attracted significant research attention in recent years, the problem remains large and is still being actively studied by researchers and practitioners. To mitigate such fake reviews, the labeled and unlabeled data generated on a daily basis are taken as the basis for analysis. With this in mind, this study carries out an in-depth analysis and designs a methodology for detecting opinion spam.
Based on the degree of availability of labeled data, three major types of Machine Learning (ML) approaches have been considered: Supervised Learning, Semi-supervised Learning and Unsupervised Learning. Our investigation starts with supervised learning techniques to identify review spam, based on labeled data. A unified model has been proposed to separate malicious spam reviews from genuine ones using the only publicly available standard dataset. The most effective feature sets have been assembled for model building. Sentiment analysis has also been incorporated in the detection process. To get the best performance, some well-known classifiers have been applied to the labeled dataset. It is observed that although n-gram features and linguistic features have been adopted by a few researchers earlier, none of them has proposed a unified model that works universally for all types of data. The proposed model obtained an accuracy of 92.12%, which is better than existing models, and it also considers the features mentioned individually. One of the major challenges in this area of research is the unavailability of enough standard datasets. Owing to the difficulty of manually labeling training examples, unsupervised learning techniques have been explored to identify spam using unlabeled data. Amazon's unlabeled Cell Phone review dataset is used for this purpose. Clustering is applied after the desired attributes have been computed for spam detection. In the experimental evaluation, 7.684% of reviews were identified as outliers and hence classified as spam reviews. In an attempt to use minimal labeled data and to take advantage of the large amount of available unlabeled data, semi-supervised learning methods have been applied, using labeled datasets to label the rest of the unlabeled data. Five different algorithms have been adapted and applied in order to detect deceptive reviews. 
The dataset used for this purpose is more varied, and a larger number of features has been incorporated. From the obtained results, it is observed that the proposed semi-supervised methods classify the datasets efficiently and perform better than existing work. Given the high velocity at which reviews are generated daily, handling and processing such large volumes of data becomes the bottleneck. To overcome this problem, a Big Data approach is used, in which a distributed and scalable cluster such as Hadoop is needed for storing (HDFS) and processing (MapReduce as well as Spark) the datasets efficiently. The experiments show that the proposed big data analytics framework demonstrates the need for developing suitable models and efficiently detects deceptive reviewers in a large stream of review system data.

    Nature inspired computing for data science


    A model for sentiment and emotion analysis of unstructured social media text

    Sentiment analysis has applications in diverse contexts such as in the gathering and analysis of opinions of individuals about various products, issues, social, and political events. Understanding public opinion can help improve decision making. Opinion mining is a way of retrieving information via search engines, blogs, microblogs and social networks. Individual opinions are unique to each person, and Twitter tweets are an invaluable source of this type of data. However, the huge volume and unstructured nature of text/opinion data pose a challenge to analyzing the data efficiently. Accordingly, proficient algorithms/computational strategies are required for mining and condensing tweets as well as finding sentiment bearing words. Most existing computational methods/models/algorithms in the literature for identifying sentiments from such unstructured data rely on machine learning techniques with the bag-of-word approach as their basis. In this work, w

    Understanding Large-Scale Network Effects in Detecting Review Spammers

    Opinion spam detection is a challenge for online review systems and social forum operators. Opinion spamming costs businesses and people money because it deceives customers, as well as automated opinion mining and sentiment analysis systems, by bestowing undeserved positive opinions on target firms and/or fake negative opinions on others. One popular detection approach is to model a review system as a network of users, products, and reviews, for example using review graph models. In this article, we study the effects of network scale on network-based review spammer detection models, specifically on the trust model and the SpammerRank model. We evaluate both network models using two large publicly available review datasets, namely the Amazon dataset (containing 6 million reviews by more than 2 million reviewers) and the UCSD dataset (containing over 82 million reviews by 21 million reviewers). We observe that the SpammerRank model provides better scaling time for applications requiring reviewer indicators, while for the trust model the distributions flatten out, indicating variance of reviews with respect to spamming. Detailed observations on the scaling effects of these models are reported in the results section.